73 research outputs found

    A semi-parametric Bayesian model for unsupervised differential co-expression analysis

    Background: Differential co-expression analysis is an emerging strategy for characterizing disease-related dysregulation of gene expression regulatory networks. Given pre-defined sets of biological samples, such analysis aims at identifying genes that are co-expressed in one, but not in the other set of samples. Results: We developed a novel probabilistic framework for jointly uncovering contexts (i.e. groups of samples) with specific co-expression patterns, and groups of genes with different co-expression patterns across such contexts. In contrast to current clustering and bi-clustering procedures, the implicit similarity measure in this model used for grouping biological samples is based on the clustering structure of genes within each sample and not on traditional measures of gene expression level similarities. Within this framework, biological samples with widely discordant expression patterns can be placed in the same context as long as the co-clustering structure of genes is concordant within these samples. To the best of our knowledge, this is the first method to date for unsupervised differential co-expression analysis in this generality. When applied to the problem of identifying molecular subtypes of breast cancer, our method identified reproducible patterns of differential co-expression across several independent expression datasets. Sample groupings induced by these patterns were highly informative of the disease outcome. Expression patterns of differentially co-expressed genes provided new insights into the complex nature of the ERα regulatory network. Conclusions: We demonstrated that the use of the co-clustering structure as the similarity measure in the unsupervised analysis of sample gene expression profiles provides valuable information about expression regulatory networks.

    COVID-19 susceptibility variants associate with blood clots, thrombophlebitis and circulatory diseases.

    Epidemiological studies suggest that individuals with comorbid conditions, including diabetes and chronic lung, inflammatory and vascular disease, are at higher risk of adverse COVID-19 outcomes. Genome-wide association studies have identified several loci associated with increased susceptibility and severity for COVID-19. However, it is not clear whether these associations are genetically determined or not. We used a Phenome-Wide Association (PheWAS) approach to investigate the role of genetically determined COVID-19 susceptibility on disease-related outcomes. PheWAS analyses were performed in order to identify traits and diseases related to COVID-19 susceptibility and severity, evaluated through a predictive COVID-19 risk score. We utilised phenotypic data in up to 400,000 individuals from the UK Biobank, including Hospital Episode Statistics and General Practice data. We identified a spectrum of associations between both genetically determined COVID-19 susceptibility and severity and a number of traits. COVID-19 risk was associated with increased risk for phlebitis and thrombophlebitis (OR = 1.11, p = 5.36e-08). We also identified significant signals between COVID-19 susceptibility and blood clots in the leg (OR = 1.1, p = 1.66e-16) and increased risk for blood clots in the lung (OR = 1.12, p = 1.45e-10). Our study identifies a significant association of genetically determined COVID-19 with increased blood clot events in the leg and lungs. The reported associations between both COVID-19 susceptibility and severity and other diseases add to the identification and stratification of individuals at increased risk, adverse outcomes and long-term effects.

    Improving statistical inference on pathogen densities estimated by quantitative molecular methods: malaria gametocytaemia as a case study

    BACKGROUND: Quantitative molecular methods (QMMs) such as quantitative real-time polymerase chain reaction (q-PCR), reverse-transcriptase PCR (qRT-PCR) and quantitative nucleic acid sequence-based amplification (QT-NASBA) are increasingly used to estimate pathogen density in a variety of clinical and epidemiological contexts. These methods are often classified as semi-quantitative, yet estimates of reliability or sensitivity are seldom reported. Here, a statistical framework is developed for assessing the reliability (uncertainty) of pathogen densities estimated using QMMs and the associated diagnostic sensitivity. The method is illustrated with quantification of Plasmodium falciparum gametocytaemia by QT-NASBA. RESULTS: The reliability of pathogen (e.g. gametocyte) densities, and the accompanying diagnostic sensitivity, estimated by two contrasting statistical calibration techniques, are compared; a traditional method and a mixed model Bayesian approach. The latter accounts for statistical dependence of QMM assays run under identical laboratory protocols and permits structural modelling of experimental measurements, allowing precision to vary with pathogen density. Traditional calibration cannot account for inter-assay variability arising from imperfect QMMs and generates estimates of pathogen density that have poor reliability, are variable among assays and inaccurately reflect diagnostic sensitivity. The Bayesian mixed model approach assimilates information from replica QMM assays, improving reliability and inter-assay homogeneity, providing an accurate appraisal of quantitative and diagnostic performance. CONCLUSIONS: Bayesian mixed model statistical calibration supersedes traditional techniques in the context of QMM-derived estimates of pathogen density, offering the potential to improve substantially the depth and quality of clinical and epidemiological inference for a wide variety of pathogens

    Quantitative PCR for Detection and Enumeration of Genetic Markers of Bovine Fecal Pollution

    Accurate assessment of health risks associated with bovine (cattle) fecal pollution requires a reliable host-specific genetic marker and a rapid quantification method. We report the development of quantitative PCR assays for the detection of two recently described bovine feces-specific genetic markers and a method for the enumeration of these markers using a Markov chain Monte Carlo approach. Both assays exhibited a range of quantification from 25 to 2 × 10^6 copies of target DNA, with a coefficient of variation of <2.1%. One of these assays can be multiplexed with an internal amplification control to simultaneously detect the bovine-specific genetic target and presence of amplification inhibitors. The assays detected only cattle fecal specimens when tested against 204 fecal DNA extracts from 16 different animal species and also demonstrated a broad distribution among individual bovine samples (98 to 100%) collected from five geographically distinct locations. The abundance of each bovine-specific genetic marker was measured in 48 individual samples and compared to quantitative PCR-enumerated quantities of rRNA gene sequences representing total Bacteroidetes, Bacteroides thetaiotaomicron, and enterococci in the same specimens. Acceptable assay performance combined with the prevalence of DNA targets across different cattle populations provides experimental evidence that these quantitative assays will be useful in monitoring bovine fecal pollution in ambient waters.

    A standardized framework for the validation and verification of clinical molecular genetic tests

    The validation and verification of laboratory methods and procedures before their use in clinical testing is essential for providing a safe and useful service to clinicians and patients. This paper outlines the principles of validation and verification in the context of clinical human molecular genetic testing. We describe implementation processes, types of tests and their key validation components, and suggest some relevant statistical approaches that can be used by individual laboratories to ensure that tests are conducted to defined standards

    Multi-laboratory survey of qPCR enterococci analysis method performance in U.S. coastal and inland surface waters

    Quantitative polymerase chain reaction (qPCR) has become a frequently used technique for quantifying enterococci in recreational surface waters, but there are several methodological options. Here we evaluated how three method permutations (type of mastermix, sample extract dilution and use of controls in results calculation) affect method reliability among multiple laboratories with respect to sample interference. Multiple samples from each of 22 sites representing an array of habitat types were analyzed using EPA Method 1611 and 1609 reagents with full-strength and five-fold diluted extracts. The presence of interference was assessed in three ways: using sample processing and PCR amplification controls; consistency of results across extract dilutions; and relative recovery of target genes from spiked enterococci in water samples compared to control matrices, with acceptable recovery defined as 50 to 200%. Method 1609, which is based on an environmental mastermix, was found to be superior to Method 1611, which is based on a universal mastermix. Method 1611 had a control assay failure rate of over 40% with undiluted extracts and 6% with diluted extracts. Method 1609 failed in only 11% and 3% of undiluted and diluted extract analyses, respectively. Use of sample processing control assay results in the delta-delta Ct method for calculating relative target gene recoveries increased the number of acceptable recovery results. The delta-delta method tended to bias recoveries from apparently partially inhibitory samples on the high side, which could help in avoiding potential underestimates of enterococci, an important consideration in a public health context. Control assay and delta-delta recovery results were largely consistent across the range of habitats sampled, and among laboratories.
The methodological option that best balanced acceptable estimated target gene recoveries with method sensitivity and avoidance of underestimated enterococci densities was Method 1609 without extract dilution and using the delta-delta calculation method. The applicability of this method can be extended by the analysis of diluted extracts to sites where interference is indicated but, particularly in these instances, should be confirmed by augmenting the control assays with analyses for target gene recoveries from spiked target organisms.
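
The delta-delta Ct recovery calculation mentioned in this abstract can be illustrated with a minimal sketch. It uses the generic comparative-Ct form, not the exact calculation prescribed in EPA Method 1609 or 1611. The function names, the argument layout and the default amplification factor of 2 (perfect doubling per cycle) are assumptions made for illustration; only the 50 to 200% acceptance window comes from the abstract.

```python
def delta_delta_ct_ratio(ct_target_sample, ct_control_sample,
                         ct_target_calibrator, ct_control_calibrator,
                         amplification_factor=2.0):
    """Generic comparative-Ct ratio of target in a sample vs. a calibrator.

    The control Ct (e.g. a sample processing control) normalizes each
    target Ct before the sample is compared with the calibrator.
    amplification_factor=2.0 assumes perfect doubling per cycle (assumption).
    """
    delta_sample = ct_target_sample - ct_control_sample
    delta_calibrator = ct_target_calibrator - ct_control_calibrator
    delta_delta = delta_sample - delta_calibrator
    return amplification_factor ** (-delta_delta)


def recovery_is_acceptable(ratio, low=0.5, high=2.0):
    """Check a recovery ratio against the 50 to 200% window from the abstract."""
    return low <= ratio <= high


# Hypothetical Ct values: the sample target appears one cycle later than in
# the calibrator, after control normalization, so the ratio is 0.5 (50%).
ratio = delta_delta_ct_ratio(26.0, 30.0, 25.0, 30.0)
print(ratio, recovery_is_acceptable(ratio))
```

Normalizing by the control Ct means a partially inhibited sample, whose control also amplifies late, is corrected upward rather than reported low, which is consistent with the high-side bias the abstract describes.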

    A Bayesian method for calculating real-time quantitative PCR calibration curves using absolute plasmid DNA standards

    Background: In real-time quantitative PCR studies using absolute plasmid DNA standards, a calibration curve is developed to estimate an unknown DNA concentration. However, potential differences in the amplification performance of plasmid DNA compared to genomic DNA standards are often ignored in calibration calculations and in some cases impossible to characterize. A flexible statistical method that can account for uncertainty between plasmid and genomic DNA targets, replicate testing, and experiment-to-experiment variability is needed to estimate calibration curve parameters such as intercept and slope. Here we report the use of a Bayesian approach to generate calibration curves for the enumeration of target DNA from genomic DNA samples using absolute plasmid DNA standards. Results: Instead of the two traditional methods (classical and inverse), a Monte Carlo Markov Chain (MCMC) estimation was used to generate single, master, and modified calibration curves. The mean and the percentiles of the posterior distribution were used as point and interval estimates of unknown parameters such as intercepts, slopes and DNA concentrations. The software WinBUGS was used to perform all simulations and to generate the posterior distributions of all the unknown parameters of interest. Conclusion: The Bayesian approach defined in this study allowed for the estimation of DNA concentrations from environmental samples using absolute standard curves generated by real-time qPCR. The approach accounted for uncertainty from multiple sources such as experiment-to-experiment variation, variability between replicate measurements, as well as uncertainty introduced when employing calibration curves generated from absolute plasmid DNA standards.
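
For orientation, the following is a minimal sketch of the traditional "classical" calibration that the abstract contrasts with its Bayesian approach. It is not the MCMC/WinBUGS model the authors report. It assumes the usual linear standard curve, Ct = intercept + slope × log10(copies), fitted by ordinary least squares and then inverted for an unknown sample. The function names and the standard-curve values are hypothetical.

```python
def fit_standard_curve(log10_copies, ct_values):
    """Ordinary least-squares fit of Ct = intercept + slope * log10(copies)."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_log10_copies(ct, intercept, slope):
    """Invert the fitted curve to estimate log10 copies for an observed Ct."""
    return (ct - intercept) / slope


# Hypothetical plasmid standards: 10^1 to 10^5 copies per reaction.
standards_log10 = [1.0, 2.0, 3.0, 4.0, 5.0]
standards_ct = [36.5, 33.0, 29.5, 26.0, 22.5]

intercept, slope = fit_standard_curve(standards_log10, standards_ct)
unknown_log10 = estimate_log10_copies(29.5, intercept, slope)
print(intercept, slope, 10 ** unknown_log10)
```

A point estimate of this kind carries no uncertainty from replicate or run-to-run variation, which is the gap the abstract says the Bayesian posterior intervals are meant to fill.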

    Standardized data quality acceptance criteria for a rapid Escherichia coli qPCR method (Draft Method C) for water quality monitoring at recreational beaches

    There is growing interest in the application of rapid quantitative polymerase chain reaction (qPCR) and other PCR-based methods for recreational water quality monitoring and management programs. This interest has strengthened given the publication of U.S. Environmental Protection Agency (EPA)-validated qPCR methods for enterococci fecal indicator bacteria (FIB) and has extended to similar methods for Escherichia coli (E. coli) FIB. Implementation of qPCR-based methods in monitoring programs can be facilitated by confidence in the quality of the data produced by these methods. Data quality can be determined through the establishment of a series of specifications that should reflect good laboratory practice. Ideally, these specifications will also account for the typical variability of data coming from multiple users of the method. This study developed proposed standardized data quality acceptance criteria that were established for important calibration model parameters and/or controls from a new qPCR method for E. coli (EPA Draft Method C) based upon data that was generated by 21 laboratories. Each laboratory followed a standardized protocol utilizing the same prescribed reagents and reference and control materials. After removal of outliers, statistical modeling based on a hierarchical Bayesian method was used to establish metrics for assay standard curve slope, intercept and lower limit of quantification that included between-laboratory, replicate testing within laboratory, and random error variability. A nested analysis of variance (ANOVA) was used to establish metrics for calibrator/positive control, negative control, and replicate sample analysis data. These data acceptance criteria should help those who may evaluate the technical quality of future findings from the method, as well as those who might use the method in the future. 
Furthermore, these benchmarks and the approaches described for determining them may be helpful to method users seeking to establish comparable laboratory-specific criteria if changes in the reference and/or control materials must be made

    Evaluation of multiple laboratory performance and variability in analysis of recreational freshwaters by a rapid Escherichia coli qPCR method (Draft Method C)

    There is interest in the application of rapid quantitative polymerase chain reaction (qPCR) methods for recreational freshwater quality monitoring of the fecal indicator bacteria Escherichia coli (E. coli). In this study we determined the performance of 21 laboratories in meeting proposed, standardized data quality acceptance (QA) criteria and the variability of target gene copy estimates from these laboratories in analyses of 18 shared surface water samples by a draft qPCR method developed by the U.S. Environmental Protection Agency (EPA) for E. coli. The participating laboratories ranged from academic and government laboratories with more extensive qPCR experience to “new” water quality and public health laboratories with relatively little previous experience in most cases. Failures to meet QA criteria for the method were observed in 24% of the total 376 test sample analyses. Of these failures, 39% came from two of the “new” laboratories. Likely factors contributing to QA failures included deviations in recommended procedures for the storage and preparation of reference and control materials. A master standard curve calibration model was also found to give lower overall variability in log10 target gene copy estimates than the delta-delta Ct (ΔΔCt) calibration model used in previous EPA qPCR methods. However, differences between the mean estimates from the two models were not significant and variability between laboratories was the greatest contributor to overall method variability in either case. Study findings demonstrate the technical feasibility of multiple laboratories implementing this or other qPCR water quality monitoring methods with similar data quality acceptance criteria but suggest that additional practice and/or assistance may be valuable, even for some more generally experienced qPCR laboratories. 
Special attention should be placed on providing and following explicit guidance on the preparation, storage and handling of reference and control materials